Abstract

A short and elementary proof of the joint convexity of relative entropy
is presented, using nothing beyond linear algebra. The key ingredients are
an easily verified integral representation and the strategy used to prove the
Cauchy-Schwarz inequality in elementary courses. Several consequences are
proved in a way which allows an elementary proof of strong subadditivity
in a few more lines. Some expository material on Schwarz inequalities for
operators and the Holevo bound for partial measurements is also included.



1 Introduction 

Because the strong subadditivity (SSA) of quantum entropy plays an important
role in quantum information theory, there has been some interest in simple proofs
[?] suitable for elementary courses. In this note, we give a self-contained
proof of SSA, valid for finite dimensional systems, using only basic linear algebra
and an easily verified integral representation. The basic strategy was used in [?].
However, the presentation here, unlike that in [?] and [13], does not explicitly use
the relative modular operator. Instead, the simple left and right multiplication
operations explained in Section 2.1 suffice. Unlike [?], not even elementary
results from complex analysis are used.

* Partially supported by the National Security Agency (NSA) and Advanced Research and 
Development Activity (ARDA) under Army Research Office (ARO) contract number DAAD19- 
02-1-0065, and by the National Science Foundation under Grant DMS-0314228. 



The state of a quantum system is described by a density matrix $\rho$, i.e., a
positive semi-definite matrix satisfying $\mathrm{Tr}\,\rho = 1$. The entropy of a quantum
state represented by density matrix $\rho$ was defined in 1927 by von Neumann [?] as

$$S(\rho) = -\mathrm{Tr}\,\rho \log \rho. \tag{1}$$

The property of SSA arises when the relevant quantum system is composed of
subsystems so that $\rho_{ABC}$ is a density matrix on a tensor product space of the form
$\mathcal{H}_A \otimes \mathcal{H}_B \otimes \mathcal{H}_C$, and the partial trace is used to define the reduced density
matrices, $\rho_{AB} = \mathrm{Tr}_C\,\rho_{ABC}$ and $\rho_B = \mathrm{Tr}_A\,\rho_{AB} = \mathrm{Tr}_{AC}\,\rho_{ABC}$, etc. The SSA
inequality [10] is

$$S(\rho_{ABC}) + S(\rho_B) \leq S(\rho_{AB}) + S(\rho_{BC}). \tag{2}$$
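As a numerical sanity check of (2), the following sketch samples random three-qubit density matrices and evaluates the gap between the two sides. It assumes NumPy; the helper names (`entropy`, `random_density_matrix`, `ssa_gap`) and the sampling scheme are illustrative choices, not part of the text.

```python
import numpy as np

def entropy(rho):
    """Von Neumann entropy S(rho) = -Tr rho log rho (natural logarithm)."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]  # convention: 0 log 0 = 0
    return float(-np.sum(evals * np.log(evals)))

def random_density_matrix(dim, rng):
    """Hypothetical helper: G G^dagger normalized to unit trace."""
    g = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = g @ g.conj().T
    return rho / np.trace(rho).real

def ssa_gap(rho_abc, d_a, d_b, d_c):
    """Return S(AB) + S(BC) - S(ABC) - S(B); (2) says this is non-negative."""
    t = rho_abc.reshape(d_a, d_b, d_c, d_a, d_b, d_c)
    # Repeated einsum indices perform the partial traces.
    rho_ab = np.einsum("abcdec->abde", t).reshape(d_a * d_b, d_a * d_b)
    rho_bc = np.einsum("abcaef->bcef", t).reshape(d_b * d_c, d_b * d_c)
    rho_b = np.einsum("abcaec->be", t)
    return entropy(rho_ab) + entropy(rho_bc) - entropy(rho_abc) - entropy(rho_b)

rng = np.random.default_rng(0)
gaps = [ssa_gap(random_density_matrix(8, rng), 2, 2, 2) for _ in range(20)]
print("smallest SSA gap over the samples:", min(gaps))
```

A non-negative smallest gap is consistent with (2); random sampling of this kind illustrates the inequality but of course proves nothing.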

Many applications of SSA use closely related properties of the relative entropy

$$H(P,Q) = \mathrm{Tr}\, P (\log P - \log Q) \tag{3}$$

which is well-defined for positive semi-definite $P, Q$ whenever $\ker(Q) \subset \ker(P)$,
provided that we define $P(\log P - \log Q) = 0$ on $\ker(P)$. A description of the properties
of $S(\rho)$ and $H(P,Q)$, and the connections between them, is given in [20, ?].

The key result is the next theorem. 

Theorem 1 The relative entropy is jointly convex in $P, Q$, i.e., when $P_j, Q_j$ are
sequences of positive semi-definite matrices satisfying $\ker(Q_j) \subset \ker(P_j)$, then

$$H\Big(\sum_j \lambda_j P_j, \sum_j \lambda_j Q_j\Big) \leq \sum_j \lambda_j H(P_j, Q_j) \tag{4}$$

with $\lambda_j \geq 0$ and $\sum_j \lambda_j = 1$.
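The statement of Theorem 1 can be illustrated numerically. The sketch below assumes NumPy and non-singular random density matrices, so that the kernel condition is trivially satisfied; the helper names are hypothetical and the relative entropy is taken as $\mathrm{Tr}\,P(\log P - \log Q)$.

```python
import numpy as np

def log_psd(m):
    """Logarithm of a positive definite Hermitian matrix via its eigendecomposition."""
    w, v = np.linalg.eigh(m)
    return (v * np.log(w)) @ v.conj().T

def rel_entropy(p, q):
    """H(P,Q) = Tr P (log P - log Q), assuming P and Q are non-singular."""
    return float(np.trace(p @ (log_psd(p) - log_psd(q))).real)

def random_density_matrix(dim, rng):
    """Hypothetical helper: G G^dagger normalized to unit trace."""
    g = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = g @ g.conj().T
    return rho / np.trace(rho).real

rng = np.random.default_rng(1)
dim, n = 3, 4
ps = [random_density_matrix(dim, rng) for _ in range(n)]
qs = [random_density_matrix(dim, rng) for _ in range(n)]
lam = rng.random(n)
lam /= lam.sum()  # convex weights lambda_j

# Left and right sides of the joint convexity inequality.
lhs = rel_entropy(sum(l * p for l, p in zip(lam, ps)),
                  sum(l * q for l, q in zip(lam, qs)))
rhs = sum(l * rel_entropy(p, q) for l, p, q in zip(lam, ps, qs))
print("H of the mixtures:", lhs, " mixture of H:", rhs)
```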

After proving Theorem 1 in Section 2, we obtain some important corollaries in
Section 3 and show in (26) that SSA follows easily from part (b) of Theorem 2, without
need for any auxiliary spaces or other results.

Although our main purpose is to present a simple proof of SSA, we added some
expository material. In Section 4 we compare the argument in Section 2.4 to
elementary proofs of the Cauchy-Schwarz inequality and give a direct proof of the
monotonicity of relative entropy. In Section 5 we present three short proofs of the
Holevo bound, each of which is valid for partial measurements.

We will frequently use expressions, such as $A \log Q$ or $A^\dagger Q^{-1} A$, without requiring
the operator $Q$ to be non-singular. But we only do so when $\ker(Q) \subset \ker(A)$ and
the expression involved can be well-defined by replacing $Q$ by $Q + \epsilon I$ and taking
a limit $\epsilon \to 0+$. For simplicity and ease of exposition, we proceed as if $Q$ is
non-singular and refer to [11] for technical details.



2 Proof of joint convexity of $H(P,Q)$

2.1 Right and left multiplication 

The proof will use the operations of left and right multiplication by $P$ which are
defined as $L_P(X) = PX$ and $R_P(X) = XP$. Both $L_P$ and $R_P$ are linear operators
on the vector space of $d \times d$ matrices which becomes a Hilbert space when equipped
with the Hilbert-Schmidt (HS) inner product $\langle A, B \rangle = \mathrm{Tr}\, A^\dagger B$. The following
properties are easy to verify.

a) The operators $L_P$ and $R_Q$ commute since

$$L_P[R_Q(A)] = PAQ = R_Q[L_P(A)] \tag{5}$$

even when $P$ and $Q$ do not commute.

b) $L_P$ and $R_P$ are invertible if and only if $P$ is non-singular, in which case
$L_P^{-1} = L_{P^{-1}}$ and $R_P^{-1} = R_{P^{-1}}$.

c) Let $\widehat{L}_P$ denote the adjoint of $L_P$ with respect to the HS inner product. It follows
from

$$\mathrm{Tr}\, A^\dagger L_P(B) = \mathrm{Tr}\, A^\dagger P B = \mathrm{Tr}\, (P^\dagger A)^\dagger B = \mathrm{Tr}\, [L_{P^\dagger}(A)]^\dagger B \tag{6}$$

that $\widehat{L}_P = L_{P^\dagger}$ and, similarly, $\widehat{R}_P = R_{P^\dagger}$. Thus, $P = P^\dagger$ implies that the
operators $L_P$ and $R_P$ are self-adjoint.

d) When $P \geq 0$, the operators $L_P$ and $R_P$ are positive semi-definite, i.e.,

$$\mathrm{Tr}\, A^\dagger L_P(A) = \mathrm{Tr}\, A^\dagger P A \geq 0 \quad \text{and} \quad \mathrm{Tr}\, A^\dagger R_P(A) = \mathrm{Tr}\, A^\dagger A P = \mathrm{Tr}\, A P A^\dagger \geq 0.$$
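Properties (a)-(d) can be checked concretely by representing $L_P$ and $R_Q$ as $d^2 \times d^2$ matrices. The sketch below assumes NumPy and a row-major vectorization of $X$, for which $\mathrm{vec}(PX) = (P \otimes I)\,\mathrm{vec}(X)$ and $\mathrm{vec}(XQ) = (I \otimes Q^T)\,\mathrm{vec}(X)$; the function names are illustrative only.

```python
import numpy as np

def left_mult(p):
    """Matrix of L_P on row-major vec(X): vec(PX) = (P kron I) vec(X)."""
    return np.kron(p, np.eye(p.shape[0]))

def right_mult(q):
    """Matrix of R_Q on row-major vec(X): vec(XQ) = (I kron Q^T) vec(X)."""
    return np.kron(np.eye(q.shape[0]), q.T)

rng = np.random.default_rng(2)
d = 3
g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
h = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
P = g @ g.conj().T  # positive definite with probability one
Q = h @ h.conj().T
X = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))

LP, RQ = left_mult(P), right_mult(Q)

# The Kronecker forms reproduce the definitions L_P(X) = PX and R_Q(X) = XQ.
acts_correctly = (np.allclose(LP @ X.reshape(-1), (P @ X).reshape(-1))
                  and np.allclose(RQ @ X.reshape(-1), (X @ Q).reshape(-1)))
# (a) L_P and R_Q commute even though P and Q do not.
commute = np.allclose(LP @ RQ, RQ @ LP)
# (b) the inverse of L_P is L_{P^{-1}}.
inverse_ok = np.allclose(np.linalg.inv(LP), left_mult(np.linalg.inv(P)))
# (c) the HS adjoint is the conjugate transpose in this representation.
self_adjoint = np.allclose(LP, LP.conj().T) and np.allclose(RQ, RQ.conj().T)
# (d) positivity: smallest eigenvalue of L_P and of R_Q.
min_eig = min(np.linalg.eigvalsh(LP).min(), np.linalg.eigvalsh(RQ).min())
print(acts_correctly, commute, inverse_ok, self_adjoint, min_eig)
```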



2.2 Strategy 

We reduce the proof of the joint convexity of $H(P,Q)$ to the proof of the following
two statements.

I) One can write the relative entropy in the form

$$H(P,Q) = \int_0^\infty \mathrm{Tr}\,(Q-P) \frac{1}{L_Q + tR_P}(Q-P)\, \frac{dt}{(1+t)^2} \tag{7}$$

II) The map $(A,P,Q) \mapsto \mathrm{Tr}\, A^\dagger \frac{1}{L_P + tR_Q}(A)$ is jointly convex in $A, P, Q$.

Letting $A = P - Q$ and using (II) in (I), yields the joint convexity of $H(P,Q)$.

Since each $Q_j$ is positive semi-definite, $\langle \phi, \sum_j Q_j \phi \rangle = 0$ forces
$\langle \phi, Q_j \phi \rangle = 0$ for every $j$, so that $\ker(\sum_j Q_j) \subset \ker(Q_j)$. Thus,
$\ker(Q_j) \subset \ker(P_j)$ implies $\ker(\sum_j Q_j) \subset \ker(\sum_j P_j)$ and, under the
hypothesis of Theorem 1, all expressions which arise are well-defined.

2.3 Proof of the integral representation I. 

We begin with the easily verified integral representation 

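$$-\log w = \int_0^\infty \left[ \frac{1}{w+t} - \frac{1}{1+t} \right] dt \tag{8}$$

which can be rewritten as

$$-\log w = (1 - w) + \int_0^\infty \frac{(w-1)^2}{w+t}\, \frac{dt}{(1+t)^2}. \tag{9}$$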
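The scalar representations of $-\log w$ are easy to confirm numerically. The sketch below assumes NumPy and uses the substitution $t = s/(1-s)$, under which $[1/(w+t) - 1/(1+t)]\,dt$ becomes $(1-w)/(w + s(1-w))\,ds$ on $(0,1)$ and $dt/(1+t)^2$ becomes $ds$. The function names and the choice of Gauss-Legendre quadrature are illustrative.

```python
import numpy as np

def neg_log_basic(w, n=100):
    """Integral of 1/(w+t) - 1/(1+t) over t in (0, inf), after t = s/(1-s)."""
    x, wt = np.polynomial.legendre.leggauss(n)
    s = 0.5 * (x + 1.0)  # Gauss-Legendre nodes mapped from (-1, 1) to (0, 1)
    return float(np.sum(0.5 * wt * (1.0 - w) / (w + s * (1.0 - w))))

def neg_log_split(w, n=100):
    """(1 - w) plus the integral of (w-1)^2 / (w+t) against dt/(1+t)^2."""
    x, wt = np.polynomial.legendre.leggauss(n)
    s = 0.5 * (x + 1.0)
    integrand = (w - 1.0) ** 2 * (1.0 - s) / (w + s * (1.0 - w))
    return float((1.0 - w) + np.sum(0.5 * wt * integrand))

for w in (0.2, 0.5, 1.0, 3.0, 10.0):
    print(w, -np.log(w), neg_log_basic(w), neg_log_split(w))
```

Both quadratures should agree with $-\log w$ to near machine precision for moderate $w$, since the transformed integrands are smooth on $[0,1]$.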



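Next, use a basis in which $Q$ is diagonal, to see that

$$\mathrm{Tr}\,(\log L_Q)(P) = \mathrm{Tr}\, L_{\log Q}(P) = \mathrm{Tr}\, P \log Q. \tag{10}$$

Using this and the fact that $L_Q$ and $R_P$ commute, one finds

\begin{align}
H(P,Q) &= -\mathrm{Tr}\,(\log R_P^{-1})(P) - \mathrm{Tr}\,(\log L_Q)(P) \tag{11} \\
&= -\mathrm{Tr}\,[\log(L_Q R_P^{-1})](P) \notag \\
&= \mathrm{Tr}\,(1 - L_Q R_P^{-1})(P) + \int_0^\infty \mathrm{Tr}\,(L_Q R_P^{-1} - 1) \frac{1}{L_Q R_P^{-1} + t} (L_Q R_P^{-1} - 1)(P)\, \frac{dt}{(1+t)^2} \tag{12}
\end{align}

where the last step replaced $w$ by $L_Q R_P^{-1}$ in (9). To see why (12) leads to (7), first note
that

$$(L_Q R_P^{-1} - 1)(P) = L_Q(I) - P = Q - P. \tag{13}$$

This can be used on the far right in (12) and also gives $\mathrm{Tr}\,(1 - L_Q R_P^{-1})(P) =
\mathrm{Tr}\,(P - Q) = 0$ when $\mathrm{Tr}\, P = \mathrm{Tr}\, Q$, as for density matrices. Next, use property (b)
above to see that

\begin{align}
\mathrm{Tr}\, A (L_Q R_P^{-1} - 1)(B) &= \mathrm{Tr}\, A (L_Q - R_P) \circ R_P^{-1}(B) \tag{14} \\
&= \mathrm{Tr}\, [(L_Q - R_P)(A)]\, R_P^{-1}(B). \notag
\end{align}

Using this with $A = I$ and $B = (L_Q R_P^{-1} + tI)^{-1}(X)$ gives

\begin{align}
\mathrm{Tr}\,(L_Q R_P^{-1} - 1) \frac{1}{L_Q R_P^{-1} + tI}(X) &= \mathrm{Tr}\,(Q-P)\, R_P^{-1} (L_Q R_P^{-1} + tI)^{-1}(X) \notag \\
&= \mathrm{Tr}\,(Q-P) \frac{1}{L_Q + tR_P}(X) \tag{15}
\end{align}

where we used $R_P^{-1}(L_Q R_P^{-1} + tI)^{-1} = [(L_Q R_P^{-1} + tI) R_P]^{-1} = (L_Q + tR_P)^{-1}$. Letting
$X = Q - P$ and inserting (15) in (12) yields (7). QED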
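The integral representation of $H(P,Q)$ can also be tested numerically for non-commuting density matrices. The sketch below assumes NumPy, unit-trace non-singular $P$ and $Q$, and the Kronecker-product form of $L_Q$ and $R_P$ on row-major vectorized matrices. With $t = s/(1-s)$ one has $dt/(1+t)^2 = ds$ and $(L_Q + tR_P)^{-1} = (1-s)[(1-s)L_Q + sR_P]^{-1}$. All names are hypothetical.

```python
import numpy as np

def log_psd(m):
    """Logarithm of a positive definite Hermitian matrix via its eigendecomposition."""
    w, v = np.linalg.eigh(m)
    return (v * np.log(w)) @ v.conj().T

def rel_entropy(p, q):
    """H(P,Q) = Tr P (log P - log Q) for non-singular P, Q."""
    return float(np.trace(p @ (log_psd(p) - log_psd(q))).real)

def random_density_matrix(dim, rng):
    """Hypothetical helper; mixing with I/dim keeps eigenvalues away from zero."""
    g = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = g @ g.conj().T
    rho = rho / np.trace(rho).real
    return 0.5 * rho + 0.5 * np.eye(dim) / dim

def rel_entropy_integral(p, q, n=100):
    """Integral of Tr (Q-P) (L_Q + t R_P)^{-1} (Q-P) dt/(1+t)^2, after t = s/(1-s)."""
    d = p.shape[0]
    eye = np.eye(d)
    l_q = np.kron(q, eye)      # L_Q on row-major vec(X)
    r_p = np.kron(eye, p.T)    # R_P on row-major vec(X)
    x_vec = (q - p).reshape(-1)
    nodes, weights = np.polynomial.legendre.leggauss(n)
    total = 0.0
    for node, wt in zip(nodes, weights):
        s = 0.5 * (node + 1.0)
        y = np.linalg.solve((1.0 - s) * l_q + s * r_p, x_vec)
        # vdot conjugates its first argument, giving Tr X^dagger Y with X = Q - P
        total += 0.5 * wt * (1.0 - s) * np.vdot(x_vec, y).real
    return float(total)

rng = np.random.default_rng(3)
P = random_density_matrix(3, rng)
Q = random_density_matrix(3, rng)
direct = rel_entropy(P, Q)
via_integral = rel_entropy_integral(P, Q)
print(direct, via_integral)
```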



2.4 Proof of the joint convexity II: 

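First observe that the properties of $L_P$ and $R_Q$ given in Section 2.1 and the
Hilbert-Schmidt inner product facilitate the evaluation of such expressions as

\begin{align}
\mathrm{Tr}\, [(L_P + R_Q)^{-1/2}(A)]^\dagger (L_P + R_Q)^{-1/2}(B) &= \langle (L_P + R_Q)^{-1/2}(A), (L_P + R_Q)^{-1/2}(B) \rangle \notag \\
&= \langle A, (L_P + R_Q)^{-1}(B) \rangle = \mathrm{Tr}\, A^\dagger (L_P + R_Q)^{-1}(B). \notag
\end{align}

Now let $M_j = (L_{P_j} + tR_{Q_j})^{-1/2}(A_j) - (L_{P_j} + tR_{Q_j})^{1/2}(\Lambda)$. Then

\begin{align}
0 \leq \sum_j \mathrm{Tr}\, M_j^\dagger M_j &= \sum_j \langle M_j, M_j \rangle \notag \\
&= \sum_j \mathrm{Tr}\, A_j^\dagger (L_{P_j} + tR_{Q_j})^{-1}(A_j) - \mathrm{Tr}\, \Big(\sum_j A_j^\dagger\Big) \Lambda \tag{16} \\
&\quad - \mathrm{Tr}\, \Lambda^\dagger \Big(\sum_j A_j\Big) + \mathrm{Tr}\, \Lambda^\dagger \sum_j (L_{P_j} + tR_{Q_j})(\Lambda). \notag
\end{align}

Next, observe that for any matrix $W$,

$$\sum_j (L_{P_j} + tR_{Q_j})(W) = \sum_j (P_j W + t W Q_j) = L_{\sum_j P_j}(W) + tR_{\sum_j Q_j}(W).$$

Therefore, inserting the choice $\Lambda = (L_{\sum_j P_j} + tR_{\sum_j Q_j})^{-1}(\sum_j A_j)$ in (16) yields

$$\mathrm{Tr}\, \Big(\sum_j A_j\Big)^\dagger \frac{1}{L_{\sum_j P_j} + tR_{\sum_j Q_j}} \Big(\sum_j A_j\Big) \leq \sum_j \mathrm{Tr}\, A_j^\dagger \frac{1}{L_{P_j} + tR_{Q_j}} (A_j) \tag{17}$$

for any $t > 0$. Since $(xA)^\dagger \frac{1}{L_{xP} + R_{xQ}} (xA) = x \Big[ A^\dagger \frac{1}{L_P + R_Q} (A) \Big]$, this implies$^1$ joint
convexity. QED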
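Inequality (17) lends itself to a direct numerical check. The following sketch assumes NumPy, random positive definite $P_j, Q_j$ and arbitrary complex $A_j$, and again represents $L_P + tR_Q$ as a Kronecker-product matrix acting on row-major vectorized matrices; the helper names are illustrative.

```python
import numpy as np

def quad_form(a, p, q, t):
    """Tr A^dagger (L_P + t R_Q)^{-1}(A) using Kronecker-product superoperators."""
    d = p.shape[0]
    eye = np.eye(d)
    op = np.kron(p, eye) + t * np.kron(eye, q.T)
    a_vec = a.reshape(-1)
    return float(np.vdot(a_vec, np.linalg.solve(op, a_vec)).real)

def random_positive(dim, rng):
    """Hypothetical helper: G G^dagger + I, positive definite and well conditioned."""
    g = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    return g @ g.conj().T + np.eye(dim)

rng = np.random.default_rng(5)
d, n, t = 3, 4, 0.7
As = [rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d)) for _ in range(n)]
Ps = [random_positive(d, rng) for _ in range(n)]
Qs = [random_positive(d, rng) for _ in range(n)]

# Left side: the quadratic form at the sums; right side: the sum of the forms.
lhs = quad_form(sum(As), sum(Ps), sum(Qs), t)
rhs = sum(quad_form(a, p, q, t) for a, p, q in zip(As, Ps, Qs))
print(lhs, rhs)
```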



2.5 Remarks 

For simplicity, we used (11) as the starting point for obtaining the integral
representation (7). It is equivalent, and customary, to begin instead with a symmetric
variant of (11), $H(P,Q) = -\mathrm{Tr}\, P^{1/2} \log(L_Q R_P^{-1})(P^{1/2})$, and then observe that

$$(L_Q R_P^{-1} - 1)(P^{1/2}) = (R_P)^{-1/2}(Q - P).$$

One advantage to our approach, like that in [?], is that it is easily extended
to give a proof of joint convexity when $-\log w$ is replaced by another operator
convex function. This only changes the weight function in the integral; see [?]
for details. Replacing $\frac{1}{(1+t)^2}$ by $\delta(1 - t)$ in (7) yields $(Q - P) \frac{1}{L_Q + R_P} (Q - P)$ which
is the generalized relative entropy whose Hessian yields the Riemannian metric
associated with the Bures metric $D^{\mathrm{Bures}}(P,Q) = [2(1 - \mathrm{Tr}\,(\sqrt{P} Q \sqrt{P})^{1/2})]^{1/2}$.

$^1$ If this is not obvious, see the Appendix.



3 Consequences of joint convexity 

3.1 Monotonicity of relative entropy 

The joint convexity of relative entropy implies the well-known fact [?]
that it decreases under completely positive, trace-preserving (CPT) maps. These
maps represent quantum channels. We will prove this result by first considering
two special cases, the partial trace and the projection onto the diagonal, which
are of sufficient importance to deserve separate statements and have extremely
elementary proofs.

Theorem 2 Let $\Phi^{QC}$ denote the map which projects a matrix onto its diagonal,
and let $\Phi$ be any CPT map. Then

a) $H[\Phi^{QC}(\rho), \Phi^{QC}(\gamma)] \leq H(\rho, \gamma)$

b) $H[\rho_A, \gamma_A] \leq H(\rho_{AB}, \gamma_{AB})$

c) $H[\Phi(\rho), \Phi(\gamma)] \leq H(\rho, \gamma)$

Proof: First, let $Z$ denote the diagonal unitary matrix with elements $z_{jk} = \delta_{jk} \omega^j$
with $\omega = e^{2\pi i/d}$ and note that $(1 - \omega^{(k-n)}) \sum_j \omega^{j(k-n)} = 1 - \omega^{d(k-n)} = 0$.
Then, for any matrix $X$ with elements $x_{kn}$,

$$\Big( \sum_j Z^j X Z^{-j} \Big)_{kn} = \sum_j \omega^{j(k-n)} x_{kn} = d\, \delta_{kn} x_{kn} \tag{18}$$

which implies that $\Phi^{QC}(X) = \frac{1}{d} \sum_j Z^j X Z^{-j}$ projects a matrix onto its diagonal.
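The averaging identity (18) is simple to verify in code. This sketch assumes NumPy, indexes the diagonal of $Z$ from $0$ to $d-1$ (which only changes $Z$ by an overall phase), and uses a hypothetical function name.

```python
import numpy as np

def project_onto_diagonal(x):
    """(1/d) sum_n Z^n X Z^{-n} with Z = diag(omega^j), omega = exp(2 pi i / d)."""
    d = x.shape[0]
    omega = np.exp(2j * np.pi / d)
    z = np.diag(omega ** np.arange(d))
    total = np.zeros((d, d), dtype=complex)
    for n in range(d):
        zn = np.linalg.matrix_power(z, n)
        total += zn @ x @ zn.conj().T  # Z^{-n} = (Z^n)^dagger since Z is unitary
    return total / d

rng = np.random.default_rng(6)
X = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
twirled = project_onto_diagonal(X)
print(np.round(twirled, 12))
```

The off-diagonal entries of `twirled` vanish up to rounding, while the diagonal of `X` is left unchanged.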

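Now write a bipartite state $\rho_{AB} = \sum_{jk} P_{jk} \otimes |e_j\rangle\langle e_k|$ as a block matrix with
blocks $P_{jk}$, and similarly $\gamma_{AB}$ with blocks $Q_{jk}$. Then

\begin{align}
H(\rho_A, \gamma_A) &= H\Big(\sum_k P_{kk}, \sum_k Q_{kk}\Big) \leq \sum_k H(P_{kk}, Q_{kk}) \tag{19} \\
&= H\Big[\sum_k P_{kk} \otimes |e_k\rangle\langle e_k|, \sum_k Q_{kk} \otimes |e_k\rangle\langle e_k|\Big] \notag \\
&= H[(I_A \otimes \Phi^{QC})(\rho_{AB}), (I_A \otimes \Phi^{QC})(\gamma_{AB})] \tag{20} \\
&= H\Big[\tfrac{1}{d} \sum_j (I \otimes Z)^j \rho_{AB} (I \otimes Z)^{-j}, \tfrac{1}{d} \sum_j (I \otimes Z)^j \gamma_{AB} (I \otimes Z)^{-j}\Big] \notag \\
&\leq \tfrac{1}{d} \sum_j H[(I \otimes Z)^j \rho_{AB} (I \otimes Z)^{-j}, (I \otimes Z)^j \gamma_{AB} (I \otimes Z)^{-j}] \notag \\
&= H[\rho_{AB}, \gamma_{AB}] \tag{21}
\end{align}

where Theorem 1 was used twice in the subadditive form (47), and the final equality
uses the fact that conjugation of both arguments by a unitary matrix does not
change $H(\rho, \gamma)$. This proves part (b). When the space $\mathcal{H}_A$ is 1-dimensional, the
inequality between (20) and (21) yields part (a).

To prove (c) fix the ancilla representation of Lemma 3 and let

$$\sigma_{AB} = U_{AB}\, \rho \otimes |\phi_B\rangle\langle\phi_B|\, U_{AB}^\dagger, \qquad \tau_{AB} = U_{AB}\, \gamma \otimes |\phi_B\rangle\langle\phi_B|\, U_{AB}^\dagger.$$

Then $\Phi(\rho) = \sigma_A$, $\Phi(\gamma) = \tau_A$ and, since $U_{AB}$ is unitary, $H(\rho, \gamma) = H(\sigma_{AB}, \tau_{AB})$.
Thus, it follows from part (b) that

$$H[\Phi(\rho), \Phi(\gamma)] = H(\sigma_A, \tau_A) \leq H(\sigma_{AB}, \tau_{AB}) = H(\rho, \gamma). \quad \text{QED} \tag{22}$$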



3.2 Convexity corollaries 

The conditional entropy is given by

$$S(\rho_{AB}) - S(\rho_A) = -H(\rho_{AB}, \rho_A \otimes \tfrac{1}{d} I) + \log d. \tag{23}$$

It then follows immediately from the joint convexity of $H(\rho, \gamma)$ that

$$\rho_{AB} \mapsto S(\rho_{AB}) - S(\rho_A) \text{ is concave.} \tag{24}$$

Moreover, for any CPT map $\Phi$, the map

$$\rho \mapsto S(\rho) - S[\Phi(\rho)] \text{ is concave.} \tag{25}$$

This follows from (24). Use the same notation as in part (c) of the previous section
and observe that $S(\rho) - S[\Phi(\rho)] = S(\sigma_{AB}) - S(\sigma_A)$.

3.3 Completing the proof of SSA 

The SSA inequality (2) follows immediately from part (b) of Theorem 2 with
$\gamma = \rho_{AC} \otimes \frac{1}{d} I$. We write this out explicitly using

\begin{align}
S(\rho_A) - S(\rho_{AB}) &= H(\rho_{AB}, \rho_A \otimes \tfrac{1}{d} I) - \log d \notag \\
&\leq H(\rho_{ABC}, \rho_{AC} \otimes \tfrac{1}{d} I) - \log d \tag{26} \\
&= S(\rho_{AC}) - S(\rho_{ABC}). \quad \text{QED} \notag
\end{align}

There is another form of SSA which follows easily from (24), namely,

$$S(\rho_B) + S(\rho_D) \leq S(\rho_{AB}) + S(\rho_{AD}). \tag{27}$$

To prove this first consider

$$F(\rho_{ABD}) = S(\rho_{AB}) + S(\rho_{AD}) - S(\rho_B) - S(\rho_D). \tag{28}$$

When $\rho_{ABD}$ is pure, it follows from Lemma 4 that $S(\rho_{AB}) = S(\rho_D)$ and $S(\rho_{AD}) =
S(\rho_B)$. Thus, $F(\rho_{ABD}) = 0$ for pure states. Since $F(\rho_{ABD})$ is the sum of two
functions $S(\rho_{AB}) - S(\rho_B)$ and $S(\rho_{AD}) - S(\rho_D)$ which are concave by (24), the map
$\rho_{ABD} \mapsto F(\rho_{ABD})$ is also concave. Since any mixed $\rho_{ABD}$ is a convex combination
of pure states, $F(\rho_{ABD}) \geq 0$, which implies (27).

By Lemma 5 one can purify $\rho_{ABC}$ or $\rho_{ABD}$ to $\rho_{ABCD}$ and use Lemma 4 to
show that (27) holds if and only if (2) does.

4 Remarks on Cauchy-Schwarz type inequalities 

4.1 Elementary proof strategy 

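The elementary vector version of the Cauchy-Schwarz inequality states that

$$\Big| \sum_k \bar{v}_k w_k \Big|^2 \leq \Big( \sum_k |v_k|^2 \Big) \Big( \sum_k |w_k|^2 \Big). \tag{29}$$

When $v_k = \sqrt{p_k}$, $w_k = p_k^{-1/2} a_k$, this can be written as

$$\sum_k \bar{a}_k\, \frac{1}{\sum_k p_k}\, \sum_k a_k \leq \sum_k \bar{a}_k \frac{1}{p_k} a_k. \tag{30}$$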

In [11], Lieb and Ruskai proved an operator version of (30), namely that

$$\Big( \sum_k A_k \Big)^\dagger \frac{1}{\sum_k P_k} \Big( \sum_k A_k \Big) \leq \sum_k A_k^\dagger \frac{1}{P_k} A_k \tag{31}$$

holds as an operator inequality. This is equivalent to the statement that the map
$(A, P) \mapsto A^\dagger P^{-1} A$ is jointly operator convex. The proof in Section 2.4 is based on
that in [11] which (although published later) actually preceded the proof of SSA.
However, without the additional ingredient of $L_P$ and $R_Q$, which are motivated by
Araki's subsequent introduction [1] of the relative modular operator, the results
in [11] are not sufficient to prove SSA. The recognition that the argument in [11]
could be modified to prove SSA took another 25 years [?].
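The operator inequality (31) asserts that the difference of its two sides is positive semi-definite, which can be probed numerically. The sketch below assumes NumPy and random positive definite $P_k$; the helper name and the shift by the identity (used only to keep the inverses well conditioned) are illustrative choices.

```python
import numpy as np

def random_positive(dim, rng):
    """Hypothetical helper: G G^dagger + I, positive definite and well conditioned."""
    g = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    return g @ g.conj().T + np.eye(dim)

rng = np.random.default_rng(7)
d, n = 3, 4
As = [rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d)) for _ in range(n)]
Ps = [random_positive(d, rng) for _ in range(n)]

A_sum, P_sum = sum(As), sum(Ps)
# np.linalg.solve(P, A) computes P^{-1} A without forming the inverse explicitly.
lhs = A_sum.conj().T @ np.linalg.solve(P_sum, A_sum)
rhs = sum(a.conj().T @ np.linalg.solve(p, a) for a, p in zip(As, Ps))

gap = rhs - lhs
gap = 0.5 * (gap + gap.conj().T)  # symmetrize away rounding noise
min_eig = np.linalg.eigvalsh(gap).min()
print("smallest eigenvalue of the gap:", min_eig)
```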

The proofs in both [11] and Section 2.4 are variants of the standard strategy used
to prove the elementary inequality (29). One observes that $\sum_k |v_k + \lambda w_k|^2 \geq 0$ and
shows that the minimizing choice $\lambda = -(\sum_k \bar{w}_k v_k)/(\sum_k |w_k|^2)$ yields (29). In Section
2.4 the operator $\Lambda$ plays the role of $\lambda$.



4.2 Schwarz inequalities for CP maps 

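We now consider Schwarz type inequalities involving completely positive (CP)
maps. Let $\Phi$ be a CP map written in Kraus form $\Phi(P) = \sum_j K_j P K_j^\dagger$. It then
follows from (31) that

\begin{align}
[\Phi(A)]^\dagger \frac{1}{\Phi(P)} \Phi(A) &= \Big( \sum_j K_j A^\dagger K_j^\dagger \Big) \frac{1}{\sum_j K_j P K_j^\dagger} \Big( \sum_j K_j A K_j^\dagger \Big) \notag \\
&\leq \sum_j K_j A^\dagger \frac{1}{P} A K_j^\dagger = \Phi\Big( A^\dagger \frac{1}{P} A \Big). \tag{32}
\end{align}

By making the replacements $A \to B^\dagger A$ and $P \to B^\dagger B$ in (32), and using
$B (B^\dagger B)^{-1} B^\dagger \leq I$, one finds

$$[\Phi(B^\dagger A)]^\dagger \frac{1}{\Phi(B^\dagger B)} \Phi(B^\dagger A) \leq \Phi(A^\dagger A). \tag{33}$$

This inequality is proved in [11] using the Stinespring [17, 22] representation.

Choi [3] realized that (32) and (33) hold under the weaker condition that $\Phi$ is
2-positive. His approach is quite different, and based on the fact that a $2 \times 2$ block
matrix $\begin{pmatrix} P & C \\ C^\dagger & Q \end{pmatrix}$ is positive semi-definite if and only if $C = \sqrt{P} X \sqrt{Q}$ with $X$ a
contraction, i.e., $X^\dagger X \leq I$. When $P, Q$ are both non-singular, this is equivalent
to $C^\dagger P^{-1} C \leq Q$. The 2-positivity of $\Phi$ says that
$\begin{pmatrix} \Phi(B^\dagger B) & \Phi(B^\dagger A) \\ \Phi(A^\dagger B) & \Phi(A^\dagger A) \end{pmatrix}$ is positive
semi-definite. Applying the condition above yields (33).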



4.3 Monotonicity of relative entropy 

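In [?] a strategy similar to that in Section 2.4 was used to give a direct proof of the
monotonicity of relative entropy under CPT maps without using an auxiliary space.
Let $\widehat{\Phi}$ denote the adjoint of $\Phi$ with respect to the HS inner product.
One begins as before, but with $M = (L_P + tR_Q)^{-1/2}(A) - (L_P + tR_Q)^{1/2}[\widehat{\Phi}(X)]$
and $X = (L_{\Phi(P)} + tR_{\Phi(Q)})^{-1}[\Phi(A)]$. Then $\mathrm{Tr}\, M^\dagger M \geq 0$ implies

\begin{align}
\mathrm{Tr}\, A^\dagger \frac{1}{L_P + tR_Q} A - 2\, \mathrm{Tr}\, [\Phi(A)]^\dagger \frac{1}{L_{\Phi(P)} + tR_{\Phi(Q)}} \Phi(A) & \tag{34} \\
+ \mathrm{Tr}\, [\widehat{\Phi}(X)]^\dagger [L_P + tR_Q] \widehat{\Phi}(X) &\geq 0. \notag
\end{align}

Comparing the last two terms requires a bit more work and the use of (33). Since
$\Phi$ trace preserving implies $\widehat{\Phi}(I) = I$, (33) implies $[\widehat{\Phi}(X)]^\dagger \widehat{\Phi}(X) \leq \widehat{\Phi}(X^\dagger X)$ and
$\widehat{\Phi}(X) [\widehat{\Phi}(X)]^\dagger \leq \widehat{\Phi}(X X^\dagger)$. Then, using the cyclicity of the trace, one finds

\begin{align}
\mathrm{Tr}\, [\widehat{\Phi}(X)]^\dagger [L_P + tR_Q] \widehat{\Phi}(X) &= \mathrm{Tr}\, \widehat{\Phi}(X) [\widehat{\Phi}(X)]^\dagger P + t\, \mathrm{Tr}\, [\widehat{\Phi}(X)]^\dagger \widehat{\Phi}(X) Q \notag \\
&\leq \mathrm{Tr}\, [\widehat{\Phi}(X X^\dagger) P + t \widehat{\Phi}(X^\dagger X) Q] \tag{35} \\
&= \mathrm{Tr}\, [X X^\dagger \Phi(P) + t X^\dagger X \Phi(Q)] \notag \\
&= \mathrm{Tr}\, X^\dagger [L_{\Phi(P)} + tR_{\Phi(Q)}] X \tag{36} \\
&= \mathrm{Tr}\, [\Phi(A)]^\dagger \frac{1}{L_{\Phi(P)} + tR_{\Phi(Q)}} \Phi(A). \notag
\end{align}

Using this in (34) allows one to combine the last two terms as before. Substituting
the resulting inequality in (7) yields part (c) of Theorem 2.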

5 Holevo bounds for partial measurements 

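In order to state the Holevo bound, we introduce some notation. Let $\mathcal{E}$ denote an
ensemble $\{\pi_j, \rho_j\}$ with $\pi_j > 0$, $\sum_j \pi_j = 1$ and each $\rho_j$ a density matrix. The Holevo
$\chi$-quantity is defined as

$$\chi(\mathcal{E}) = S\Big( \sum_j \pi_j \rho_j \Big) - \sum_j \pi_j S(\rho_j). \tag{37}$$

A set of positive semi-definite operators $\{M_a\}$ satisfying $\sum_a M_a = I$ is called a positive
operator valued measurement (POVM) and denoted $\mathcal{M}$. Every POVM defines a
CPT map $\Phi_{\mathcal{M}}$ which takes $\rho \mapsto \sum_a (\mathrm{Tr}\, \rho M_a) |a\rangle\langle a|$. The usual Holevo bound states
that

$$\chi(\mathcal{E}) \geq \chi[\Phi_{\mathcal{M}}(\mathcal{E})] = S\Big[ \sum_j \pi_j \Phi_{\mathcal{M}}(\rho_j) \Big] - \sum_j \pi_j S[\Phi_{\mathcal{M}}(\rho_j)] \tag{38}$$

where $\Phi_{\mathcal{M}}(\mathcal{E})$ denotes the ensemble in which each $\rho_j$ is replaced by $\Phi_{\mathcal{M}}(\rho_j)$.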
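To make the bound (38) concrete, the sketch below computes $\chi$ for a random ensemble before and after a measurement. It assumes NumPy and, for simplicity, a projective POVM $M_a = |u_a\rangle\langle u_a|$ built from the columns of a random unitary; general POVMs are not covered, and all helper names are hypothetical.

```python
import numpy as np

def entropy(rho):
    """Von Neumann entropy S(rho) = -Tr rho log rho (natural logarithm)."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]  # convention: 0 log 0 = 0
    return float(-np.sum(evals * np.log(evals)))

def random_density_matrix(dim, rng):
    """Hypothetical helper: G G^dagger normalized to unit trace."""
    g = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
    rho = g @ g.conj().T
    return rho / np.trace(rho).real

def holevo_chi(weights, states):
    """chi = S(sum_j pi_j rho_j) - sum_j pi_j S(rho_j)."""
    avg = sum(w * r for w, r in zip(weights, states))
    return entropy(avg) - sum(w * entropy(r) for w, r in zip(weights, states))

def measure(rho, basis):
    """Phi_M(rho) = sum_a Tr(rho M_a) |a><a| for M_a = |u_a><u_a|, u_a = columns of basis."""
    probs = np.einsum("ia,ij,ja->a", basis.conj(), rho, basis).real
    return np.diag(probs)

rng = np.random.default_rng(9)
d, n = 3, 4
weights = rng.random(n)
weights /= weights.sum()
states = [random_density_matrix(d, rng) for _ in range(n)]
# QR factorization of a random complex matrix gives a unitary measurement basis.
basis, _ = np.linalg.qr(rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d)))

chi_before = holevo_chi(weights, states)
chi_after = holevo_chi(weights, [measure(r, basis) for r in states])
print(chi_before, chi_after)
```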

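There is now an extensive literature on bounds involving partial measurements.
Consider the situation in which two parties, Alice and Bob, share an ensemble of
(possibly entangled) states $\{\pi_j, \rho_j^{AB}\}$ on $\mathcal{H}_A \otimes \mathcal{H}_B$, on which one of the parties
makes a measurement. In such cases, one expects a bound of the form

$$\chi(\mathcal{E}^{AB}) \geq \chi[(I \otimes \Phi_{\mathcal{M}_B})(\mathcal{E}^{AB})] \geq \chi[(\Phi_{\mathcal{M}_A} \otimes \Phi_{\mathcal{M}_B})(\mathcal{E}^{AB})]. \tag{39}$$

We observe that three simple strategies for proving (38) easily extend to (39).
The first proof uses the observation of Yuen and Ozawa [?] that

$$S(\rho_{\mathrm{av}}) - \sum_j \pi_j S(\rho_j) = \sum_j \pi_j H(\rho_j, \rho_{\mathrm{av}}) \tag{40}$$

where $\rho_{\mathrm{av}} = \sum_j \pi_j \rho_j$. It follows from part (c) of Theorem 2 that

\begin{align}
H(\rho_j^{AB}, \rho_{\mathrm{av}}^{AB}) &\geq H[(I \otimes \Phi_{\mathcal{M}_B})(\rho_j^{AB}), (I \otimes \Phi_{\mathcal{M}_B})(\rho_{\mathrm{av}}^{AB})] \notag \\
&\geq H[(\Phi_{\mathcal{M}_A} \otimes \Phi_{\mathcal{M}_B})(\rho_j^{AB}), (\Phi_{\mathcal{M}_A} \otimes \Phi_{\mathcal{M}_B})(\rho_{\mathrm{av}}^{AB})] \tag{41}
\end{align}

which is equivalent to (39).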

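The next proof uses the fact that $\chi(\mathcal{E})$ can be regarded as a form of mutual
information between the quantum states $\rho_j$ and their classical probability
distribution $\pi_j$. Let $\gamma_{QC} = \sum_j \pi_j \rho_j \otimes |j\rangle\langle j|$ be a density matrix on $\mathcal{H}_Q \otimes \mathcal{H}_C$. Then, as
was observed in [6],

$$\chi(\mathcal{E}) = S(\rho_{\mathrm{av}}) - \sum_j \pi_j S(\rho_j) = H(\gamma_{QC}, \gamma_Q \otimes \gamma_C). \tag{42}$$

Then part (c) of Theorem 2 gives

$$H[(\Phi \otimes I)(\gamma_{QC}), (\Phi \otimes I)(\gamma_Q \otimes \gamma_C)] \leq H(\gamma_{QC}, \gamma_Q \otimes \gamma_C) \tag{43}$$

which is equivalent to (38). To obtain (39), take $\mathcal{H}_Q = \mathcal{H}_A \otimes \mathcal{H}_B$, so that
$\gamma_{QC} = \gamma_{ABC}$, and observe that

\begin{align}
H[(\Phi \otimes \Phi \otimes I)(\gamma_{ABC}), (\Phi \otimes \Phi \otimes I)(\gamma_{AB} \otimes \gamma_C)] &\leq H[(\Phi \otimes I \otimes I)(\gamma_{ABC}), (\Phi \otimes I \otimes I)(\gamma_{AB} \otimes \gamma_C)] \tag{44} \\
&\leq H(\gamma_{ABC}, \gamma_{AB} \otimes \gamma_C). \notag
\end{align}

The final proof uses the observation in [12], that the Holevo bound (38) is
equivalent to the statement that $\rho \mapsto S(\rho) - S[\Phi_{\mathcal{M}}(\rho)]$ is concave, which is a special
case of (25). Thus, the bound (39) follows immediately from (25) with $\Phi$ replaced
first by $I_A \otimes \Phi_{\mathcal{M}_B}$ and then by $\Phi_{\mathcal{M}_A} \otimes \Phi_{\mathcal{M}_B}$.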

A Appendix 

Let $A = \sum_k \lambda_k |\phi_k\rangle\langle\phi_k|$ be a self-adjoint matrix with eigenvalues $\lambda_k$ in the domain
of the function $f(w)$. Then we define $f(A) = \sum_k f(\lambda_k) |\phi_k\rangle\langle\phi_k|$. This is equivalent
to any other reasonable definition and implies that substituting $L_Q R_P^{-1}$ for $w$ to
obtain (12) is fully justified; there is no need to explicitly find the eigenvalues and
eigenvectors of $L_Q R_P^{-1}$.

If a function $F$ satisfies $F(xA) = xF(A)$ then convexity is equivalent to
subadditivity. First, observe that when $F$ is also convex

$$\tfrac{1}{2} F(A + B) = F(\tfrac{1}{2}[A + B]) \leq \tfrac{1}{2} F(A) + \tfrac{1}{2} F(B). \tag{45}$$

Conversely, if $F$ is subadditive, then

$$F[xA + (1 - x)B] \leq F(xA) + F[(1 - x)B] = xF(A) + (1 - x)F(B). \tag{46}$$

Although the relative entropy $H(P,Q)$ is usually considered for density matrices,
(3) defines it more broadly. Since Klein's inequality [?] says that $H(P,Q) \geq
\mathrm{Tr}\, P - \mathrm{Tr}\, Q$, it follows that $H(P,Q) \geq 0$ when $\mathrm{Tr}\, P = \mathrm{Tr}\, Q$. It is easy to verify
that $H(xP, xQ) = xH(P,Q)$ for $x > 0$. Therefore, by the observations above, (4)
is equivalent to

$$H\Big( \sum_j P_j, \sum_j Q_j \Big) \leq \sum_j H(P_j, Q_j). \tag{47}$$

For completeness, we also state some well-known results used in proving corol- 
laries to the joint convexity and SSA. None are needed to obtain a proof of SSA. 

Lemma 3 (Ancilla representation) Any CPT map $\Phi : M_d \mapsto M_d$ can be
represented using an auxiliary space $\mathcal{H}_B$ as

$$\Phi(\rho) = \mathrm{Tr}_B\, U_{AB}\, \rho \otimes |\phi_B\rangle\langle\phi_B|\, U_{AB}^\dagger \tag{48}$$

where $U_{AB}$ is unitary and $|\phi_B\rangle\langle\phi_B|$ is a pure state. If $\sigma_{AB} = U_{AB}\, \rho \otimes |\phi_B\rangle\langle\phi_B|\, U_{AB}^\dagger$,
then $\Phi(\rho) = \sigma_A$ and $S(\sigma_{AB}) = S(\rho)$.

This is essentially a corollary to the Stinespring representation theorem [22]. It
was introduced in the form used here by Lindblad [13] who made the observation
about entropy and used it to give the first proof of part (c) of Theorem 2. For an overview
of representation theorems, see Chapter 2 of Paulsen [?]; for short accessible
summaries, see the appendices to [?] as well as Section III.D of [20]. The term
"ancilla representation" is introduced in [?].

The following well-known, and easily proved, facts go back at least to [?]. For
further references and discussion see [?] and [20].

Lemma 4 When $\rho_{AB} = |\psi_{AB}\rangle\langle\psi_{AB}|$ is a pure state, its reduced density matrices
$\rho_A$ and $\rho_B$ have the same non-zero eigenvalues and $S(\rho_A) = S(\rho_B)$.

Lemma 5 Given a density matrix $\rho$ in $M_d$ of rank $m$, one can find a pure $\rho_{AB}$ in
$M_d \otimes M_m$ with $\rho_A = \rho$.
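Lemmas 4 and 5 can be illustrated together with a small sketch. It assumes NumPy and builds the standard purification $|\psi\rangle = \sum_k \sqrt{\lambda_k}\, |\phi_k\rangle \otimes |k\rangle$ from the spectral decomposition of $\rho$, storing $\psi$ as a $d \times m$ coefficient array; the function name and rank tolerance are illustrative.

```python
import numpy as np

def purify(rho, tol=1e-12):
    """Coefficients psi[i, k] of a pure state on C^d (x) C^m, m = rank(rho), with Tr_B = rho."""
    evals, evecs = np.linalg.eigh(rho)
    keep = evals > tol  # discard the kernel of rho
    return evecs[:, keep] * np.sqrt(evals[keep])

rng = np.random.default_rng(11)
g = rng.normal(size=(4, 2)) + 1j * rng.normal(size=(4, 2))
rho = g @ g.conj().T          # rank 2 density matrix in dimension 4
rho /= np.trace(rho).real

psi = purify(rho)
rho_a = psi @ psi.conj().T    # partial trace over B of |psi><psi|
rho_b = psi.T @ psi.conj()    # partial trace over A of |psi><psi|

eigs_a = np.linalg.eigvalsh(rho_a)
eigs_b = np.linalg.eigvalsh(rho_b)
print(psi.shape, eigs_a, eigs_b)
```

Here `rho_a` reproduces `rho` (Lemma 5), and the non-zero eigenvalues of `rho_a` coincide with the eigenvalues of `rho_b` (Lemma 4).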